SEPIA: Surface Span Extension to Syntactic Dependency Precision-based MT Evaluation
Abstract
We present a new Machine Translation (MT) evaluation metric, SEPIA. SEPIA falls within the class of syntactically-aware evaluation metrics, which have received considerable attention recently (Liu and Gildea, 2005; Owczarzak et al., 2007; Giménez and Màrquez, 2007). Specifically, SEPIA uses a dependency representation but extends it to include surface span as a factor in the evaluation score. The dependency surface span is the surface distance between two words that are in a direct relationship in a dependency tree. The basic idea behind SEPIA is that long-distance dependencies should receive greater weight in MT evaluation metrics than short-distance dependencies, because we suspect that more long-distance matches indicate a higher degree of grammaticality. In the rest of this document we describe the SEPIA metric and its variants, and the publicly available SEPIA package.
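The notion of surface span and the idea of weighting matches by it can be illustrated with a minimal sketch. This is not the SEPIA formula, which the abstract does not give. It assumes unlabeled head-dependent word pairs, a weight equal to the raw span, and hand-written head indices in place of a parser. The function names `dependency_edges` and `span_weighted_precision` are hypothetical.

```python
from collections import Counter

def dependency_edges(tokens, heads):
    """Return (head_word, dependent_word, surface_span) triples.

    heads[i] is the index of the head of token i, or -1 for the root.
    The surface span is the distance in word positions between the two.
    """
    edges = []
    for i, h in enumerate(heads):
        if h < 0:
            continue  # the root has no incoming dependency
        edges.append((tokens[h], tokens[i], abs(h - i)))
    return edges

def span_weighted_precision(cand_edges, ref_edges):
    """Precision over candidate dependencies, each weighted by its span.

    Assumption for illustration only: the weight is the raw span, so a
    matched long-distance dependency counts more than a short one.
    """
    ref_counts = Counter((h, d) for h, d, _ in ref_edges)
    matched = 0.0
    total = 0.0
    for h, d, span in cand_edges:
        total += span
        if ref_counts[(h, d)] > 0:
            ref_counts[(h, d)] -= 1  # each reference pair matches once
            matched += span
    return matched / total if total else 0.0

# Hand-annotated toy parses (hypothetical, not parser output).
heads = [1, 5, 4, 4, 1, -1]
ref = dependency_edges(["the", "cat", "that", "I", "saw", "slept"], heads)
cand = dependency_edges(["the", "cat", "that", "I", "saw", "sleep"], heads)
score = span_weighted_precision(cand, ref)
```

In this toy pair the only unmatched dependency is the long-distance one between the main verb and its subject (span 4). An unweighted precision would be 4/5, whereas the span-weighted score is 7/11, so the long-distance error is penalized more heavily.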
Similar resources
Dependency-Based Automatic Evaluation for Machine Translation
We present a novel method for evaluating the output of Machine Translation (MT), based on comparing the dependency structures of the translation and reference rather than their surface string forms. Our method uses a treebank-based, wide-coverage, probabilistic Lexical-Functional Grammar (LFG) parser to produce a set of structural dependencies for each translation-reference sentence pair, and th...
A New Syntactic Metric for Evaluation of Machine Translation
Machine translation (MT) evaluation aims at measuring the quality of a candidate translation by comparing it with a reference translation. This comparison can be performed on multiple levels: lexical, syntactic or semantic. In this paper, we propose a new syntactic metric for MT evaluation based on the comparison of the dependency structures of the reference and the candidate translations. The ...
A Customizable MT Evaluation Metric for Assessing Adequacy Machine Translation Term Project
This project describes a customizable MT evaluation metric that provides system-dependent scores for the purposes of tuning an MT system. The features presented focus on assessing adequacy over fluency. Rather than simply examining features, this project frames the MT evaluation task as a classification question to determine whether a given sentence was produced by a human or a machine. Support Ve...
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on the Persian language if they are trained with good features. To get good performanc...
A Weakly-Supervised Rule-Based Approach for Relation Extraction
Rule-based approaches for information extraction usually achieve good precision values, although they often require considerable manual effort to implement. In this paper, we present a novel rule-based strategy for semantic relation extraction that takes advantage of partial syntactic parsing in order to simplify the linguistic structures containing instances of semantic relations. We also...